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II  Introduction 


In  this  pspsr  ve  propose  general  paraneter  estimation  techniques  to 
be  used  In  modeling  o i  transport  phenomena.  While  the  fundamental  Ideas 
we  discuss  have  proved  quite  successful  In  other  areas  of  application 
(elasticity,  seismology,  enzyme  column  reactors,  etc.— see,  for  example, 

[2],  [3],  [4],  [6],  [9]),  our  emphasis  here  Is  a  class  of  transport  equations 
arising  In  biology.  We  treat  models  in  which  certain  of  the  coefficients 
to  be  estimated  are  spatially  varying.  The  methods  are  valid  in  more 
complex  transport  problems  with  coefficients  that  vary  both  temporally  and 
spatially  (see  [6], [7]),  but  the  underlying  theory  is  technically  somewhat 
different  and  more  complicated. 

We  begin  in  section  2  by  formulating  an  "identification"  or  parameter 
estimation  problem  Involving  a  general  transport  equation.  Although  for 
co-  .ience  In  exposition  we  treat  only  scalar  equations,  the  Ideas  and 
convergence  results  easily  extend  to  vector  systems.  Indeed,  in  a  number 
of  problems  we  have  successfully  employed  the  techniques  for  coupled  systems 
of  equations. 

In  section  3,  we  then  reformulate  the  estimation  problem,  putting  it  In 

V 

abstract  form  for  concise  development  of  our  ideas.  This  abstract  problem 
is  approximated  by  a  sequence  of  estimation  problems  in  section  4  shore  we 
give  convergence  results  for  states  and  parameters.  We  also  explain  our  use 
of  computational  packages  to  solve  the  estimation  problems.  In  section  5, 
we  apply  the  methods  to  two  examples: 

(1)  estimation  of  diffusion  and  bulk  flow  parameters  for  the  transport 
of  substances  in  tissue  (specifically,  for  transport  of  sucrose  in 
cat  brain  white  matter) 
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(11)  estimation  of  spatially  varying  coefficients  In  population 

dispersal  models  (in  this  case,  nodels  of  flea  beetle  aoveaent 
within  linear  arrays  of  collard  patches) . 

As  we  shall  explain  in  our  concluding  remarks,  the  methods  we  have  developed 
have  yielded  satisfactory  results  and  insight  into  biological  systems.  He 
are  confident  that  other  investigations  concerning  transport  aodels  in  biology 
could  also  be  served  by  the  application  of  our  parameter  estimation  methods. 

In  our  discussions  in  sections  3  and  4,  it  will  be  convenient  to  eaploy 
notation  from  elementary  functional  analysis.  Although  this  notation  will 
be  quite  familiar  to  our  more  mathematically  trained  readers,  we  summarise 


some  of  it  briefly  for  others.  By  I^tt),!)  we  shall  mean  the  standard  Hilbert 


'Che  inner  product 


space  of  "functions"  f  defined  on  (0,1)  with  [  <  *».  'Che  inner  product 

f1  0  1 

in  this  space  is  <f,g>  *  fg.  He  denote  by  H  the  usual  Sobolev  space 

J0 

of  functions  f  in  I 1^(0, If  with  first  derivatives  Df  in  L2(0,l)*f  the  subspace 
of  functions  1.'  H*  vanishing  at  x  •  0  and  x  *  1  is  denoted  by  H^.  More 
generally,  will  be  the  Sobolev  space  of  functions  having  derivatives  up 
to  order  J,  with  the  jth  derivative  D^f  in  Lj. 

The  usual  space  of  essentially  bounded  functions  is  denoted  by  L„ 
with  the  norm  being  given  by  «  ess  sup|g(x) | .  The  space  H^  is  the 

space  of  functions  f  having  j  derivatives  with  l^f  in  L^. 

He  shall  use  the  symbol  | • |  to  denote  a  norm  in  most  situations. 


However,  In  some  discussions,  for  clarity  we  resort  to  the  more  cumbersome 
notation  of  |*|2,  |‘|a  a«c.  representing  none  in  L^,  La  etc. 
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82  A  fundamental  estimation  problem 

We  begin  by  considering  the  general  transport  equation  (based  on  mass 
balance  laws)  given  by 

'<»  »  *  h <Vu)  ■  4  «*•*■>  »i«f. 

t  >  0. 

Here  the  tera  Involving  V  represents  a  general  "directed  movement"  mechanism 
such  as  convection  In  tissue  transport  models  or  attractlve/chemotactic 
phenomena  In  population  dispersal  models.  We  shall  assume  that  in  general 
this  "velocity"  V  Is  spatially  varying,  i.e.,  V  ■  V(x).  The  first  term  on 
the  right  In  equation  (1)  Is  a  result  of  the  usual  Tick's  first  law  of 
diffusion  and  we  shall  assume  for  our  presentation  that  the  coefficient  of 
diffusion^  is  constant.  The  last  term  in  (1)  represents  general  sink/ source 
mechanisms  (death/birth,  reaction,  etc.)  that  might  be  present.  Of  course, 
u  represents  the  concentration  (In  tissue  transport)  or  population  density 
(la  species  dispersal)  that  is  of  primary  interest. 

We  assure  that  along  with  (1)  certain  Initial  conditions  u(0,x)  *  <Kx) 
and  Dirichlet  boundary  conditions  u(t,0)  *  8q(1)»  u(t,l)  =  g^(t) 
are  given  (or  perhaps  must  be  estimated — both  situations  arise  in  the  situations 
discussed  below).  Using  a  standard  transformation  of  variables 
(u  *  u-(l-x)gg-xgj) ,  one  can  transform  the  resulting  Initial-Boundary  Value 
Problem  (IB VP)  for  (1)  into  an  IBVP  with  homogeneous  boundary  conditions. 

We  assume  that  this  has  been  done  and  furthermore  we  make  the  assumption 
throughout  that  f  is  linear  in  u.  (This  linearity  assumption  is  not  at  all 
essential  for  the  methods  discussed  below  to  be  valid,  but  it  greatly  simplifies 
our  presentation— for  a  discussion  of  estimation  results  concerning  problems 
with  rather  general  nonlinearltles  in  f,  see  [5],  [8].) 
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After  a  change  of  variables  in  (1)  and  some  elementary  manipulations 
one  obtains  the  transformed  version  of  the  TBVP  that  will  be  central  to  our 
presentation  (we  drop  the  ~  on  the  transformed  variable  u) : 


n  =  qr~T  +  Vx)fe  +  q3(x)u  +  g(t’x) 
3x 


0  <  x  <  1, 


(2) 


u(t,0)  =  u(t,l)  =  0, 


t  >  0, 


u(0,x)  =  v(x),  0  i.  x  £  1* 

Given  observations  of  a  general  biological  process  that  can  be  represented 
by  equation  (2),  our  task  is  to  use  observations  of  u  to  determine  the 
positive  constant  q^  and  the  bounded  functions  and  q^.  More  precisely, 
we  have  observations  {y^(x)|0  <.  *  £  D  at  times  t^  >  0,  i  ■  l,  ...  ,  m, 
and  a  given  set  Q  c  R  x  L^CO.l)  *  [^(0,1)  of  admissible  parameters 
(  cl1»tl2’q3^  -  over  we  wish  to  minimize  the  fit-to-data  criterion  function 

(3)  J(q)  =  I  fVu-.x)  -  y.(x)|2dx 

i=l  JO  1  1 

where  u(t^,x)  *  u(t^,x;q)  represents  the  solution  of  (2)  corresponding 
to  q  *  ^i’^2’q3^  ’ 

In  many  practical  situations,  one  has  discrete  data  y^  for  u  at  points 
(t^t  Xj )  and  in  applying  our  methods  one  might  Instead  of  (3)  employ  the 
alternate  criterion  function 


(*)  J(q)  *  l  |u(t.,x.)  -  y..|2. 

i.j  1  3  13 

Indeed  we  did  this  in  both  the  specific  applications  discussed  below.  (One 
could,  of  course,  use  (3)  with  discrete  data  y^  by  first  constructing — say 
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by  Interpolation — an  observed  "data  function"  y^x)  from  the  discrete 
observations  . )  The  convergence  theory  for  use  of  our  methods  with 
criteria  of  the  form  J  is  somewhat  more  delicate,  requiring  one  to  establish 
a  polntwlse  in  x — as  opposed  to  — convergence  of  approximating  solutions. 

He  therefore  shall  restrict  our  discussions  of  the  theoretical  aspects  of 
our  method  to  the  "continuous  data"  experimental  situation  which  results  in 
a  problem  of  minimizing  (3)  over  Q  subject  to  (2). 

The  ideas  behind  our  methods  are  most  succintly  discussed  in  the  context 
of  an  abstract  form  of  (2),  which  we  formulate  in  the  next  section. 

S3  An  abstract  estimation  problem 

He  first  rewrite  (2)  as  an' Initial  value  problem  in  the  state  space 
Z  ■  I^O,  1).  Letting  z(t)  *  u(t,*)  and  G(t)  -  g(t,*)  denote  time  varying 
functions  with  values  in  Z,  we  can  formally  write  (2)  as  the  system 

(5)  z(t)  «  A(q) z(t)  ♦  G(t)  t  >  0, 

z(0)  -  T 

where  the  operator  A(q)  is  defined  on  Dom(A(q))»H2  HhJ  by  A(q)  ♦■q^fw^DA+q^d. 

(Here  and  below  the  operator  D  is  differentiation,  l.e.  D  ■  The  equation 

under  consideration  can  thus  be  viewed  as  an  ordinary  differential  equation 

in  the  state  space  Z.  The  analogue  to  the  usual  (in  the  case  A  is  a  matrix 

At 

operator)  solution  operator  e  is  in  this  case  the  solution  semigroup  (a 
one  parameter  family  of  operators  on  Z)  associated  with  (5)  which  we  shall 
denote  by  S(t)  (or  S(t;q)  if  we  wish  to  emphasise  the  dependence  on  q).  Thus 
solutions  to  (S)  with  G  s  0  are  given  by  z(t)  *  S(t)z(0)  ■  S(t)T  and  a 
corresponding  "variatlon-of •constants"  formula  can  be  used  to  define  solutions 
of  (5)  in  the  case  that  G  is  nontrivial.  He  summarise  results  for  (5)  in  the 
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following  lemma  (for  arguments  see  Appendix  1  and  for  detailed  discussions 
on  semigroups  and  abstract  differential  equations  see  [1].  [15]). 

Lemma  3.1.  Suppose  Lm.  Then  the  operator  A(q)  In  (5)  satisfies 

the  dissipative  Inequality  <A(q)z,z>  <_u><z,z>  where  u  =  -i| f>q2 {^ | <?3 - 
Furthermore  A  generates  a  C0-semigroup  (S(t)}  satisfying  ||S(t)||  <_ 
and  (mild)  solutions  to  (5)  are  given  (for  G  lntegrable)  by 

(6)  z(t;q)  =  S(t;q)¥  +  (  S(t-o;q)G(o)da. 

•'0 

In  view  of  the  system  reformulation  just  presented,  we  recast  our 
basic  parameter  estimation  problem  as  one  of  minimizing  over  Q  the  functionax 

m  2 

(7)  J(q)  *  l  |z(t.;q) -yJ 

i*l  1  1  2 

where  y^  la  the  observation  function  at  time  as  introduced  in  discussing 
(3)  above  and  z(t^;q)  is  defined  by  (6). 

This  problem  clearly  consists  of  minimizing  a  functional  defined  via  a 
state  equation  in  the  infinite  dimensional  space  Z  “  LjCO,  1).  Any  type  of 
computational  procedure  must  be  founded  on  some  type  of  approximation  to  the 
equation  (6).  We  turn  next  to  one  class  of  approximation  schemes  which  have 
proved  both  theoretically  and  computationally  sound. 


7 


14  Approximations  of  the  estimation  problem 

We  approximate  the  estimation  problem  Involving  (6),  (7)  by  a  sequence 
of  problems  defined  on  subspaces  N  1  1,  2,  ...  ,  of  Z.  Numerous  possi¬ 
bilities  abound,  but  the  schemes  we  discuss  here  have  proved  most  useful  in 
not  only  the  biological  applications  presented  below,  but  also  in  a  number 

U 

of  other  areas  as  indicated  in  the  introduction.  The  subspaces  Z  we  choose 

are  those  generated  by  cubic  spline  elements.  Full  details  on  similar 

problems  are  given  in  [5]  but  for  the  convenience  of  readers  and  the  sake 

of  completeness,  we  briefly  sunmarlze  the  fundamentals.  Let  Bq,  B^,  ..., 

2 

denote  the  cubic  B-spline  elements  (piecewise  cubic  C  (0,1)  functions 

corresponding  to  the  partition  =  {x^},  =  j/N,  j  =  0,1,..., N, 

of  [0,  1]  — see  [16,  p.  208-209])  modified  to  satisfy  the  boundary  conditions 

Bj  (0)  Bj(D  ■  0.  Define  ZN  ■  span  ■  In  the  usual  notation, 

ZN  *  S3(AN)  =  {*£  S3(AN)|<K0)  =  *(1)  =  0}  Where  S3(AN)  =  €  C2(0,1)|<|, 

is  a  cubic  polynomial  on  each  interval  [x^,  xi+^] 1 .  Explicit  formulae  for 
3  N 

basis  elements  for  S  (A  )  can  be  given  ([16,  p.  89],  [3])  and  these  can  be 

used  to  give  analytical  expressions  for  the  modified  basis  elements  6.  (e.g., 

see  [5,  p.  10],  [3,  p.  12]). 

N  N 

Given  the  subspaces  Z  ,  we  let  F  denote  the  orthogonal  projection  of  Z 
onto  ZN;  that  is,  for  any  z  €Z,  P^z  is  that  unique  element  in  ZN  defined  by  the 

relationships  <P^z-z,Bj>  =  0,  j  =  0,1,..., N.  Define  approximates  AN  to  A  by 

N  N  N  N 

A  (q)  =  P  A(q)P  ;  these  are  bounded  operators  in  Z.  Let  {S  (t) }  be  the 

semigroup  generated  by  AN,  l.e.  S^(t;q)  =  exp (A^(q)t }  —in  this  case,  this 

N 

exponential  definition  has  its  usual  power  series  definition  since  A  is 
bounded. 

We  use  these  constructs  to  approximate  (6)  by 
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(8) 


zN(t;q)  =  SN(t;q)PN¥  +  f  SN(t-a;q)PNG(o)da. 

J0 


or,  equivalently,  we  approximate  (5)  by 


(9) 


iN(t)  *  AN(q)zN(t)  +  PNG(t) 
zN(0)  =  pNy. 


The  associated  fit-to-data  criterion  is  then  taken  as 


(10) 


tN,  . 
J  (q) 


2 

> 

2 


and  the  sequence  of  approximating  parameter  estimation  problems  can  be  simply 

stated:  Minimize  J^Cq)  over  Q  subject  to  (8)  or  (9).  Before  discussing 

convergence  properties  of  solutions  to  these  problems,  we  explain  how  one  can 

easily  Implement  these  approximate  estimation  problems.  We  summarize  the 

more  complete  discussions  given  in  [5,  p.  10-11;  p.  22-24]. 

h,  .  -  _N 

We  first  note  that  any  z  v*]  t  l  (and,  in  particular,  any  solution  of 

N  ^  N  N 

‘(8)  or  (9))  can  be  written  as  z  (t)  =  T  w.(t;q)B.  for  appropriately 

N  *s°  3 

chosen  real  coefficients  w^(t;q).  it  is  also  easily  seen  that  (9)  is 
equivalent  to  the  Galerkin  system  of  equations 


(ID 


<zN(t),B^>  =  <A(q)zN(t)»B?>  +  <G(t),B^> 

<zN(0),B?>  *  <T,B^>,  j  =  0,1,..., N. 


If  we  substitute 


N 


z 


into  (11),  then  we  obtain  the  mattix  system 
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(12) 


QNwN(t)  =  KNwN(t)  +  RN(G(t)) 
QNwN(0)  *  RN('?) 


where  Q  , 


K  are  (N  +  1)  sc  (N  +  1)  matrices  with  elements 


(13) 


<?.  =  <bn,bn> 

ij  l  3 

=  <B?.A(q)B»>, 


and  R  ,  w"  are  N  +  1  vectors  given  by 


(14) 


RN(*) i  =  <y,B^> 

wN  *  col (Wq.Wj , . . . .mg) . 


Thus,  to  solve  the  approximate  estimation  problems,  one  deals  with 

vector  systems  of  ordinary  differential  equations.  More  precisely,  for  a 

given  index  N  of  approximation,  one  minimizes  (10)  iteratively,  using  (12) 

to  compute  *  (t;q)  (and  hence  z  (t;q))  for  each  value  of  q  in  the 

N 

iterative  procedure.  The  matrices  Q  are  seven-banded  and  symmetric  in  this 
U 

case  while  the  K  are  seven-banded,  in  general  unsymmetric,  and  involve  the 

unknown  parameters  q^,  q^,  q^.  The  iterative  procedure  we  have  used  with  great 

success  when  minimizing  (10)  is  the  Levenberg-Marquardt  algorithm  (a 

modified  Gauss-Newton  type  routine)  as  packaged  in  the  IMSL  routine  ZXSSQ. 

Either  an  IMSL  package  (DGEAR)  employing  Gear's  variable  order,  variable 

atep  method  for  stiff  systems  or  an  IMSL  package  (DVERK)  for  a  variable  order, 

variable  step  Runge-Kutta,  was  used  to  solve  (12)  at  each  step  in  the 

Levenberg-Marquardt .  (An  implementation  of  the  Cholesky  algorithm  is  used 

1 1 

to  solve  equations  of  the  fora  Q  x  ■  y  for  x.) 
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Let  Q  *  (<li  *^2  <*enote  a  soluti°n  to  problem  of 

minimizing  (10)  for  a  given  fixed  N  (assuming  for  the  present  that  such 

solutions  exist).  Our  goal,  of  course,  is  to  obtain  a  sequence  of 

estimates  (either  q  or  some  subsequence)  that  converges  as  N  -►  •  to  an 

estimate  ~q  that  will  be  a  solution  of  the  minimization  problem  involving 

(7)  (or  equivalently  (3)).  We  shall  establish  such  a  convergence  result 

N  N  * 

through  a  series  of  results  below.  We  first  argue  that  z  (t;q  )  -*■  z(t;q  ) 

N  * 

when  q  is  any  sequence  converging  to  q  in  an  appropriate  manner.  We  then, 
under  reasonable  compactness  hypotheses  on  Q,  the  set  of  admissible 
parameters,  argue  that  some  subsequence  of  (q  }  (where  q  is  a  solution 
of  the  Nth  approximate  estimation  problem)  converges  in  this  manner  to  a 
limit  parameter  q  in  Q  that  is  a  desired  optimal  estimate  for  the  original 
problem  for  (7).  We  begin  this  program  with  the  following  fundamental 
convergence  statement . 

Theorem  4.1.  Suppose  qN  -  (q^,  q^,  q^)  is  any  sequence  in  Q  n([a,b]xW^xLw) , 

0  <  a  <  b  <  «,  satisfying  Jq^L* l0^** are  bounded  with  *  q** 

N  *  *  *  2 

q^  -*■  q^  in  L^,  i  =  2,3.  Furthermore,  assume  q2>q3  €  Then 

N  N  ^  N 

I*  (t;q  )  -  z(t;q  )|2  -*■  0  as  N  ■*  «,  where  z  ,z  are  given  by  (8)  and  (6)  resp 

A  convenient  tool  to  be  used  in  establishing  this  theorem  is  a  version 

of  the  Trottar-Kato  approximation  theorem  from  linear  semigroup  theory.  We 

H 

state  and  use  here  a  simple  version  (see  [5];  in  particular  take  *  "I  and 
in  Prop.  2.1  of  that  reference).  For  other  versions  see  [15],  [8], 

[5]  and  the  references  given  there. 

Theorem  4.2.  Let  iff  be  a  Hilbert  space  and  suppose  T^(t),  T(t)  are  Cg-sami- 
groups  on  degenerated  by  linear  operatorsflf**,  sd respectively.  If  (i) 

(stability)  there  exist  constants  M  and  B  independent  of  H  such  that 
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(I  <.  (11)  (consistency)  there  exist  a  subset  0of 

dense  In  tt(,  with  0  c:  Dom(af)  and  dense  In  if  for  soae  XQ  >  0 

and  for  z  €  0  we  have  4^2 -jfz  |  ->0  as  N 
then 

ll^CtDz-TCDzi  ■*■  0  as  N  -  for  all  z 
and  the  convergence  Is  uniform  in  t  on  compact  subsets  of  (0,  •), 

We  begin  the  proof  of  Theorem  4.1  by  supposing  that  {q^ }  is  given 
as  stated  in  that  Theorem  and  then  choosing*^  =  A^fq^) and of=  A(q  )  in  Theorem  4.2. 
Here,  of  course  A  ,  A  are  as  defined  in  section  3  and  above;  we  thus  know 
thatfll**  and  Qff'  generate  semigroups  T^ft)  -  SN(t;qN)  and  T(t)  ■  S(t;qA) 
respectively.  We  first  use  Theorem  4.2  to  establish  that  SN(t;qN) Z  ■*  S(t;q  )z 
for  each  z  €  Z. 

To  verify  the  stability  condition  (1)  of  Theorem  4.2,  we  observe  that, 
since  Ifljl*  ere  bounded,  one  has  in  view  of  Lemma  3.1, 

<AN(qN)z,z>  =  <PNA(qN)PNz,z>  -  <A(qN)  PNz,PNz> 

<  w{qN)  <  PNz,PI,z>  <  u(qN)  <*,z> 

£  . 8<z,z> 

for  0  appropriately  chosen,  independent  of  N.  It  thus  follows  from  standard 
arguments  (e.g.  see  the  discussions  in  [15,  p.  16-22])  chat  an  exponential 
bound  as  in  (1)  holds. 

Mart  we  turn  to  condition  (11)  and  observing  that  A(q*)  is  the 

2  A 

infinitesimal  generator  of  a  Cq -semigroup,  we  note  that  0  s  Dom(A  W  )) 

is  dense  in  Z  (see  [IS,  p.S}).  Clearly  0  c  Dorn  (A  (q*)) ,  and 

for  A_  >  w(q  ),  we  have  that  the  resolvent  operator  R.  (A(q*))  »  [An-A(q*)] 
v  A0  .  " 

*  exists.  For  p  €  Don(A(q*))  we  find  R,  (A(q*))p  €  Dow(A2(q*))  «  0.  Thus 

0 

for  any  p  €  Dom(A(q*))  the  equation  [A0-A(q*)]p  »  p  is  solvable  for 

♦  €  0  (just  take  ♦  -  Rx  p).  Hence  (AQ-A(q*))0  »  Dom(A(q*))  so  that 

-A  •  J  0 
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(X0-A(q*))<?  is  dense  In  Z.  To  satisfy  (ii)  it  remains  to  demonstrate 
that  AN(q**)z  ■*  A(q*)z  for  z  in  the  set  3  just  defined.  We  state 
this  as  a  lemma  and  defer  detailed  arguments  to  Appendix  2. 

Lemma  4.1.  For  z  €  &  = Don(A2(q*)) ,  we  have 

|AN(qN)z-A(q*)z|2  •*>  0  as  N  -*■  «>. 

Having  verified  the  hypotheses  of  the  Trotter-Kato  theorem  in  the  case 

N  N 

of  interest  to  us  here,  we  thus  have  S  (t;q  )z  -*■  S(t;q*)z  for  z  6  Z, 
uniformly  in  t  on  compact  intervals  and  this  holds  for  any  sequence 
q  q*  satisfying  the  hypotheses  of  Theorem  4.1. 

To  complete  the  proof  of  Theorem  4.1,  we  use  (6)  and  (8)  to  write 
(again  | • |  denotes  |*|2) 

|zN(t;qN)-z(t;q*)  |  ±  |pVf|  +  | J*[SN(t-o)PNG(o)  -  S(t-a)G(o)]do| 
where  SN(t)  ■  SN(t;qN),  S(t)  ■  S(t;q*).  Thus  we  have 


|zN(t;qN)-z(t;q*)|  <  |PNf-f|  ♦  ft|SN(t-«r)  [PNG(o)  -  G(o)j|do 

;0 

♦  f  |[SN(t-o)-S(t-o)]G(o)|do 
J0 

<  |pV*|  ♦  MeStf  |PNG(o)-G(o)|do 
t  '° 

♦  f  |[SN(t-o)-S(t-o)]G(o)|do. 

J0 


Each  of  these  terns  -*■  0  as  N  «  from  the  convergence  properties  of 
P*  and  SN(t)  already  established  plus  the  dominated  convergence  theorem. 
This  completes  the  arguments  establishing  Theorem  4.1. 
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We  turn  finally  to  explain  how  the  convergence  results  of  Theoren  4.1 
can  be  used  to  obtain  desired  results  for  our  paraaeter  estimation  problems. 

We  first  place  restrictions  on  the  admissible  parameter  set  Q.  Let 
0  <  a  <  b  <  «,  let  B2  be  a  bounded  subset  of  W*  (i.e.  there  exists  K 
such  that  q2  €  B2  implies  I^L  1  K  I^L  —  and  Bj  be  a 

bounded  subset  of  L  .  We  assume 

QO 

H(a)  Qc  {q-(qj,q2,q3)  €  R1  x  W*  x  W^acq^b,  q2  €  B2»q3  6  Bj), 

H(b)  Q  is  compact  in  the  R1  x  L2  x  L2  topology. 

Consider  now  the  functional  J**  defined  in  (10)  where  zN(t;q)  * 

N 

l  w“(t;q)B?  is  defined  via  (12), (13), (14).  Noting  that  K?  « 
i«0  1  1 

^.q/ej  +  <l2DBj  +  ^3®^  depends  continuously  on  q  in  the  R1  x  L2  x  L2 

||| 

topology,  one  sees  that  it  is  not  difficult  to  argue  that  q  -*■  z  (t;q),  and 

hence  q  -►  J^q),  are  continuous  in  the  same  sense.  Thus  from  the  compact- 

ness  assumption  (b)  on  Q  we  see  there  exists  q  €  Q  that  is  a  solution  to 

the  problem  of  minimizing  JN  over  Q,  N  ■  1,2,...  . 

The  sequence  (q^)  thus  obtained  is  in  the  compact  set  Q  and  hence  we 

-Nk 

can  extract  a  subsequence  (q  )  converging  to  some  limit  parameter  q  in  Q. 
We  claim  that  q  is  a  solution  to  the  problem  of  minimizing  (7)  subject  to 
(6).  To  see  this,  we  first  observe  that  by  definition 

N  )|  M 

(15)  J  *(q  k)  <  J*(q)  for  all  q  €  Q. 

2  \ 

Since  q2,q3  €  W‘,  we  have  bylheorem  4.1  that  s  (t^jq  )  z(tjjq)  as 

N 

Nj'  ».  Furthermore,  that  same  theoren  with  q  2  q  for  all  N  yields 
z^t^q)  z(ti;q)  for  any  q  €  Q.  Recalling  (10)  and  talcing  the  limits  in 
the  inequality  (15),  we  obtain  J(q)  <_ J(q)  for  any  q  6  Q;  i.e.,  q  is  a 
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minimi zer  for  J.  We  suamarize  our  findings  in  a  formal  statement. 

Theorem  4.3.  Assume  that  Q  satisfies  the  hypotheses  H(a),H(b).  Then 

solutions  q**  to  the  problea  of  ainimizing  exist  and  there  exists  a 
-Nk  1 

subsequence  (q  >  converging  in  the  R  *  1^  *  topology  to  a  solution 
q  of  the  problea  of  minimizing  J  given  in  (7). 

We  conclude  this  section  with  several  renarks  on  the  above  discussions. 
First  note  that  we  only  obtain  (theoretically)  convergence  of  soae  subsequence 
of  the  approximate  estimates.  In  actual  practice  we  almost  always  have  found 

M 

that  the  sequence  {q  }  itself  converges.  One  can  prove  that  this  stronger 
statement  is  true  in  the  case  that  the  original  estimation  problem  (for  (7)) 
has  a  unique  solution  -  a  situation  unhappily  rarely  encountered  with  real 
data  and  a  sophisticated  model  involving  a  partial  differential  equation. 

The  theory  developed  above  extends  easily  to  the  case  where  one  wishes 
to  also  estimate  the  boundary  conditions  (e.g. ,  the  brain  transport  example 
below)  and/or  initial  conditions  (e.g.,  the  insect  dispersal  example  below). 
For  ease  in  exposition  we  have  not  treated  these  cases  directly  in  our  theory 
sketched  here;  the  theoretical  ideas  are  the  same  in  these  cases  (albeit 
the  technical  arguments  are  slightly  more  involved)  as  the  interested  reader 
can  ascertain  by  consulting  [5] , [8] . 

Finally,  as  with  most  "theorems"  in  applied  mathematics,  the  conclusion 
of  Theorem  4.3  is  valid  in  many  situations  where  the  smoothness  hypotheses 
(e.g.,  H(a))  of  the  theorem  are  not  satisfied.  We  have  numerous  computational 
exaaples  on  which  the  methods  perform  well  (i.e. ,  converge)  even  though  the 
coefficients  are  not  smooth.  Indeed,  in  the  insect  dispersal  example  below, 
we  have  f  if,  so  strictly  speaking,  Theorem  4.3  is  not  applicable.  But 


u  we  shall  see,  the  estiaation  scheaes  perform  admirably.  In  this  par¬ 
ticular  instance,  one  can,  at  the  expense  of  a  great  increase  in  technical 
tedium,  modify  the  arguments  in  this  paper  to  actually  establish  convergence. 
However,  in  a  number  of  other  areas  of  applications,  we  have  used  our  methods 
successfully  even  when  we  cannot  establish  convergence  theorems  for  the 
particular  class  of  equations  under  investigation. 
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15.  Applications  to  biological  system:  brain  transport  and  insect  dispersal 
Problems 

In  this  section  we  apply  the  spline  techniques  to  questions  of  biological 
interest.  In  particular,  by  using  these  techniques  in  conjunction  with  experi¬ 
mental  data  we  identify  convection  and  diffusion  terms  for  a  brain  transport 
system  and  for  a  population  of  dispersing  insects.  Proceeding  heurlstlcally, 
we  examine  the  Identified  parameters  in  order  to  gain  insight  about  underlying 
biological  mechanisms  or  to  suggest  further  experimentation. 

A.  Testing  the  methods  with  "known"  numerical  data 

Before  applying  our  methods  to  real  experimental  systems,  we  tested  their 
performance  against  "data"  generated  by  a  known  diffusion  and  convection 
equation.  Our  intent  was  to  investigate  practical  Issues  such  as  amount  of 
data  required,  accuracy  of  method  and  computational  hazards.  In  these  tests 
we  also  considered  a  similar  (in  spirit)  approximation  method,  which  uses 
modal  (eigenfunction)  basis  elements  (see  [2]).  This  allowed  ue  to  compare 
two  algorithms  that  share  a  common  purpose,  but  that  may  differ  in  their 
effectiveness.  Since  detailed  discussions  of  our  findings  can  be  found  in 
[20],  we  summarize  those  results  only  briefly  below. 

We  consider  the  example 

u^  *  q,u  +  q„u  t  >  0,  0<x<l 

t  u  xx  M2  x  —  — 

(16) 

u(t,0)  ■  cQ 
u(0,x)  *  $(x) 

2 

where  d(x)  »  -2x  +  x  +  1.  We  ran  tests  on  this  sample  with  either 

A 

Dlrlchlet  (u(t,l)  =*  0)  or  Neumann  1)  *  0)  boundary  conditions  at 

the  right  boundary  x  «  1. 

"Data"  for  our  tests  were  generated  in  the  following  manner: 
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,  *  *  *  *  * 

Fixed  values  for  q^,  qj  end  cq  were  chosen  (e.g.  q^  *  .3,  q,  •  1.75, 

c*  *  1.0)  end  en  infinite  series  technique  (Independent  of  any  of  the  aethods 

A 

being  tested)  was  used  to  generate  numerical  solution  values  uCt^Xj)  at 
points  (ti,*j),  i  -  1,  2,  ...  ,  I,  j  -  1,  2,  ...  ,  J  in  (0,®)  x  (0,1). 
Either  these  values  alone  or  in  some  cases  these  values  with  noise  added 


(via  a  packaged  random  noise  generator)  were  used  as  data  y ^  In  the  criterion 
function  J  of  (A)  and  its  associated  approximation  3®  with  u  replaced  by  i/*. 

In  general,  the  spline  based  techniques  discussed  in  this  paper  proved 
superior  to  the  modal  techniques.  For  problems  with  homogeneous  Dlrichlet 
boundary  condition  at  x  -  1,  we  first  assumed  that  cq  is  known  and  attempted 
to  estimate  and  q2  in  (16),  given  varying  amounts  of  data.  For  I  ■  1, 

J  ■  3  (one  time  observation  with  three  spatial  points)  the  spline  method 

-8 

produced  correct  converged  estimates  (for  example,  at  N  ■  8,  estimates  q"  - 
.3001  and  q8  -  1.7486  and  residual  sum  of  squares  (RSS)  J8(q8)  *  >69  *  10~9 
were  obtained)  while  the  modal  techniques  failed  to  produce  a  numerically 
convergence  sequence  of  estimates.  For  I  -  2  or  3  and  J  »  3  (two  or  three 
time  observations,  each  involving  three  spatial  points)  both  methods  yield 
converged  values;  however,  in  these  cases  the  spline  based  method  appears 
to  be  more  accurate  (smaller  RSS)  and  mors  efficient  (computationally). 
Furthermore,  the  addition  of  an  extra  time  observation  (I  ■  3  vs.  1*2)  did 
not  improve  the  fit  of  the  model.  We  then  investigated  the  effects  of  increas¬ 
ing  the  number  J  of  spatial  points  in  the  data  sets.  A  general  finding  was 
that  there  existed  a  minimum  number  of  spatial  points  necessary  for  the  methods 
to  yield  good  parameter  estimates  (J  ■  3  sufficed  for  the  spline  scheme  while 
J  -  4  was  required  for  the  modal  method).  Beyond  this  minimal  number,  extra 
spatial  observation  points  did  not  necessarily  Increase  the  efficiency  of 


the  methods.  This  means  that  for  a  given  experimental  system  and  its 
associated  model,  it  may  be  possible  to  identify  the  number  of  data 
points  required  for  accurate  parameter  identification.  When  the  process 
of  gathering  data  is  expensive  or  time  consuming,  we  therefore  suggest  that 
our  parameter  Identification  methods  may  provide  guidance  in  deciding 
upon  the  number  of  time  periods  or  spatial  points  that  need  to  be  sampled. 

Finally,  we  turned  to  the  full  problem  of  estimating  all  three 
parameters  (q^.,  q^,  cq)  in  (16).  Our  findings  were  quite  similar  to  those 
just  summarized.  For  the  spline  based  scheme,  one  time  observation  with 
three  spatial  points  (1*1,  J  ■  3)  were  sufficient  data  to  produce  conver¬ 
gence  to  correct  parameter  values.  Whereas  taking  an  extra  time  observation 
(1  -  2)  does  not  generally  produce  better  estimates  using  the  spline  method, 
in  some  cases  it  does  if  one  is  using  the  modal  technique. 

We  also  examined  the  performance  of  the  spline  scheme  with  Neumann 
boundary  conditions  and  equation  (16).  Again,  the  method  performed  well 
in  estimating  all  three  parameters  q^,  q2*  cQ,  given  data  sets  corresponding 
to  I  •  1,  J  ■  3.  Slightly  better  parameter  estimates  were  obtained  with 
I  »  2  as  opposed  to  I  ■  1,  no  matter  the  value  of  J  (J  ■  3,  4,  5,  6).  For 
fixed  I  ■  1  or  2,  estimates  based  on  3  or  4  spatial  points  were  as  good  as 
those  obtained  from  5  or  6  spatial  points. 

In  summary,  the  tests  of  the  cubic  spline  scheme  ws  carried  out  on  the 
model  (16)  persuade  us  that  the  method  proposed  can  be  usod  with  a  good  deal 
of  numerical  confidence  with  regard  to  fitting  data  to  models  of  the  form  (2). 
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B.  Understanding  brain  fluid  transport 

A  primary  question  concerning  transport  mechanism  in  brain  tissue 
is  whether  diffusion  alone  or  diffusion  and  convection  are  responsible  for 
transport  in  gray  and  white  matter  ( 1123J173J183 ) •  Mathematically,  we  can 
view  this  as  a  question  of  determining  the  magnitude  and  thus  contribution 
of  V  (convection)  and  D  (passive  diffusion)  in  the  equation 

2 

9u  v3u  n8  u 

3t  ax 

where  u  represents  the  concentration  of  a  substance  being  transported  in 
the  brain.  To  investigate  this  problem  we  have  used  our  spline  techniques 
with  experimental  data  kindly  provided  by  Kyner,  Rosenberg  and  associates 
([17]J 18]).  The  data  consist  of  laboratory  measurements  of  u  values  at 
various  locations  in  the  tissue  at  a  fixed  time.  These  measurements  were 
obtained  from  experiments  (described  in  {17] [IB])  using  adult  cats. 

Artificial  cerebrospinal  fluid  containing  labeled  sucrose  was  perfused  into 
each  cat's  lateral  ventricle.  At  the  end  of  the  perfusion  period,  the 
animals  were  sacrificed  and  their  brains  were  rapidly  removed  and  frozen. 
Samples  of  gray  and  white  matter  along  a  direction  perpendicular  to  the 
ventricular  surface  (which  will  be  the  x-axla  in  our  model)  were  removed, 
serially  sectioned  and  analyzed.  From  measurements  of  radioactivity,  the 
average  concentration  of  sucrose  in  each  slice  was  determined,  yielding  data 
which  corresponds  to  observation  at  a  fixed  time  t ^  (t^»  1,  2, or  4  hrs.  for 
the  cat  experiments).  From  this  data  {u(t^, Xj)}  for  the  concentration 
u,  the  transport  of  sucrose  in  gray  natter  can  be  compared  with  that  in 
white  matter. 

To  analyte  these  data  we  used  the  cubic  spline  procedures  to  estimate 
parameters  representing  diffusion  (q^  in  (16)X  convection  (q2)  and  the 
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concentration  (cq)  at  the  boundary  x  ■  0.  In  Table  1  we  have  extracted 
typical  results  of  these  analyses  from  a  more  detailed  report  by  Sives  and 
Sato  [20].  To  Interpret  these  results  we  have  examined  the  predicted 
concentrations  as  though  they  were  obtained  from  a  least-squares  regression 
approach  and  then  analyzed  the  data  with  F-statlstics  (see  [22]).  Of 
course  our  model  is  not  a  simple  curvilinear  regression  equation  since  it 
is  a  dynamic  model,  but  the  parameters  were  estimated  using  a  least-squares 
minimization  routine.  This  approach  then  focuses  on:  the  total  variation 
in  data  (total  sums  of  squares  or  TSSQ) ,  the  variation  explained  by  the 
model  (explained  sums  of  squares  or  ESSQ),  and  the  sum  of  squares  error 
between  the  model's  prediction  and  the  data  (the  residuals  or  unexplained 
variation,  denoted  RSS).  This  application  of  7-statistics  is  not  strictly 
appropriate  because  we  do  not  know  anything  about  the  distribution  of 
residual  errors;  nonetheless,  it  provides  a  quantitative  measure  of  the 
performance  of  different  parameter  estimates  and  is  couched  in  terms  that 
facilitate  comparisons  between  models  (e.g. ,  explained  and  unexplained 
variation).  The  degrees  of  freedom  were  selected  in  the  following  manner 
(following  the  conventions  and  notation  set  out  in  [22]):  explained  df  * 
the  number  of  parameters  estimated  by  our  spline  technique  (analogous  to 
the  number  of  terms  used  in  polynomial  regressions) ,  total  df  *  number  of 
data  points,  and  unexplained  df  ■  (total  df)  -  (explained  df) . 

From  Table  1  two  conclusions  are  striking: 

1)  both  models  (diffusion  alone,  or  diffusion  plus  convection)  explain  an 
enormous  portion  of  the  variation  in  the  data  (a  minis um  of  97*'*), 
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2)  Che  differences  between  che  success  of  Che  two  models  are  negligible 
(at  most,  improving  the  2  of  explained  variation  from  972  to  99%). 

In  Table  1  and  in  all  of  the  analyses  performed  by  Sives  and  Sato  [20], 
the  "diffusion  plus  convection"  model  always  explained  more  of  the  variation 
than  did  the  "diffusion  alone"  model.  The  addition  of  a  convection  term 
often  resulted  in  dramatic  reductions  in  unexplained  variation  (or  residual 
error);  for  example,  in  Table  1  we  see  that  for  data  set  7,  inclusion  of 
convection  reduced  the  RSS  from  22.7  to  7.8.  However,  because  both  models 
were  consistently  so  successful,  it  is  difficult  to  establish  that  one  is 
significantly  better  than  the  other.  When  we  calculated  F  statistics  for 
the  improvement  of  explained  variation  by  moving  from  the  2-parameter 
diffusion-alone  model  to  the  3-parameter  diffusion-and-convection  model, 
our  F  statistics  never  attained  the  p  <  .05  level,  and  were  at  the  p  <  .1 
level  in  only  one  instance. 

Clearly,  the  cubic  spline  methods  yield  parameter  estimates  that  perform 
exceedingly  well  in  describing  the  data.  This  reinforces  our  faith  in  the 
methods.  Unfortunately,  we  are  not  able  to  answer  the  initial  question 
about  the  relative  importance  of  convection  in  brain  transport.  The 
consistently  better  (albeit  only  slightly)  performance  of  diffusion  plus 
convection  models  temptingly  hints  at  the  role  of  convection.  Our  inability 
to  resolve  the  issue  of  convection  cannot  be  blamed  on  the  parameter 
identification  methods.  Instead,  we  argue  from  Table  1  and  similar  analyses 
that  it  is  apparent  that  data  must  be  obtained  for  more  than  one  time 

I 

point  after  perfusion.  When  only  one  concentration  profile  is  available, 
there  is  too  much  freedom  for  juggling  combinations  of  D  and  cq  or  of  D, 

V  and  cq  such  that  the  data  are  "fit."  Our  analyses  point  out  that  the 
experiments  need  to  be  modified  in  order  to  assess  the  importance  of 
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convection.  The  addition  of  another  time  period  is  feasible  by  changing 
the  label  during  the  course  of  the  experiment  (Kyner  and  Rosenberg,  pers. 
comm.).  Indeed,  as  we  shall  see  in  the  following  section,  data  taken  at 
two  time  periods  allow  us  to  Identify  the  importance  of  "convection"  terms 
in  models  for  populations  of  dispersing  insects. 

C.  Modeling  insect  movement  in  cultivated  gardens 

Since  Skellam’s  [21]  pioneering  work  in  1951,  diffusion  models  have 
been  used  to  model  animal  dispersal.  Unfortunately,  most  of  this  modeling 
has  proceeded  independently  of  data  ([13],  [14]).  In  fact,  some  researchers 
have  suggested  that  the  paradigm  of  diffusive  flux  is  inappropriate  for 
animal  movement  and  have  advised  Instead  a  purely  descriptive  regression 
approach  to  quantifying  dispersal  ([24],  [25]).  One  of  the  problems  with 
previous  diffusion  models  is  that  only  the  simplest  process,  that  is  pure 
passive  diffusion,  lends  itself  to  tests  with  experimental  data  (see  [14]). 
Recently  one  of  the  authors  exhaustively  applied  passive  diffusion  models 
to  the  movement  of  two  common  flea  beetles,  Phyllotreta  cruciferae  and 
Phyllotreta  atriolata  [11].  Although  the  models  provided  a  good  description 
of  beetle  movement  in  some  cases,  several  experimental  results  clearly  did 
not  conform  to  simple  passive  diffusion  (see  [11]).  We  subsequently  applied 
spline  parameter  identification  methods  to  these  beetle  data  with  a  model 
extended  to  include  a  spatially-varying  convection  term. 

Using  mark-recapture  experiments,  beetle  movement  was  studied  in 
experimental  linear  arrays.  These  arrays  were  1  m  x  80  m  cultivated  strips 
in  the  middle  of  dense  goldenrod  stands,  each  array  containing  patches  of 
one  of  the  beetles  favored  foodplants,  collards.  Since  the  goldenrod 
field  surrounding  each  array  contained  no  foodplants  for  the  beetles. 
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the  beetles  tended  to  move  only  In  one  dimension  —  up  and  down  the 
linear  arrays  along  the  80  m  axis.  Details  about  the  marking  and 
recapture  procedure  and  experimental  design  are  described  in  [11}. 

The  important  point  Is  that  marked  flea  beetles  were  recaptured  anywhere 
from  1  hour  to  3  days  after  their  release  in  the  experimental  arrays; 
these  recapture  distributions  represent  the  data  that  we  seek  to 
describe  with  a  diffusion-convection  model.  The  model  we  examined  is 


~  TZ  (VOOu)  +  D 


0<  x<  1,  t  >  0 


with  u(0,x)  (which  represents  the  initial  distribution  of  marked  beetles)  known 
and  u(t, 0)  -  u(t,l)  -  0.  Here  the  linear  arrays  have  been  rescaled  to 
fit  in  the  (0,1)  interval.  On  this  rescaled  interval  the  cultivated  strips 
extended  between  .1  and  .9  and  the  actual  sampling  points  (recapture  stations) 
are  evenly  spaced  between  approximately  .20  and  .80.  The  center  of  each 
experimental  array  thus  corresponds  to  x  -  .5.  The  negative  yu  term  in 
(17)  represents  beetles  that  disappear  from  the  system  either  because  they 
die  or  engage  in  long  distance  migration  (both  processes  are  negligible 
over  the  short  time  scale  of  our  experiments  but  would  become  important 
over  longer-running  time  periods).  We  have  considered  several  different 
V(x)  functions  and  combinations  of  V(x)  with  spatially  varying  y(x)  terms. 

In  a  separate  report  we  will  synthesize  these  analyses  to  dissect  differences 
between  beetle  species  and  to  quantify  the  influence  of  crop  spacing  on 
the  movement  process  (Kareiva  and  Banks,  in  prep.).  Our  ultimate  goal 
is  to  describe  the  changes  in  density  (u)  of  beetles  through  time  and 
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space.  In  this  paper,  our  more  limited  goal  is  to  demonstrate  the 
application  of  the  spline  techniques  to  insect  dispersal  data,  and  to 
point  out  some  of  the  problems  of  such  analyses.  To  do  this  we  have 
selected  a  small  subset  of  our  analyses  for  illustrative  purposes. 

Technically  the  spline  approximation  scheme  was  successful  in  two 

ways: 

1)  it  often  identified  combinations  of  V(x)  and  D  in  equation  (17) 
that  predicted  beetle  distributions  in  close  accordance  to 
observed  distributions  (see  Table  2,  example  5.2) 

2)  it  identified  convection  terms  that  significantly  reduced  RSS 
relative  to  diffusion-alone  models  (again  see  Table  2, 
example  5.3). 

It  is  important  to  note  that  the  addition  of  convection  significantly 
reduced  RSS  while  using  only  the  initial  data  and  one  time  period  after 
that  Initial  data.  This  is  in  marked  contrast  to  the  case  involving 
brain  transport,  where  the  performance  of  diffusion-alone  versus 
diffu8lon-and-convectlon  models  could  not  be  distinguished.  We  were  able 
to  use  results  of  the  beetle  dispersal  experiments  to  examine  the  differences 
between  transport  processes  with  and  without  convection  because  the  initial 
date  in  these  experiments  was  known  and  fixed.  Consequently,  the  only  un¬ 
identified  parameters  influencing  the  fit  of  the  model  to  data  are  diffusion 
and  convection  parameters .  These  positive  statements  are  balanced  below 
by  some  cautionary  tales  concerning  the  problems  we  had  analysing  Insect 
dispersal  data. 

One  of  the  difficulties  uncovered  by  our  analysis  was  that  a  wide 
variety  of  different  convection  functions  yielded  low  RSS’s.  Moreover, 
the  convection  functions  that  worked  best  represent  functions  that 
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contradicted  our  initial  biological  hypotheses.  In  particular,  V(x) 
functions  corresponding  to  no  convection  near  x  ■  .5  and  biased  motion 
out  the  ends  of  arrays  (away  from  the  center  of  the  gardens)  yielded 
the  lowest  RSS  values.  This  contradicts  our  initial  hypothesis  that 
there  should  be  convection  near  the  ends  of  each  array  back  towards 
the  central  position  (toward  x  ■  .5).  At  this  stage,  however,  because 
there  are  so  many  reasonable  possibilities  for  V(x)  that  we  have  not 
examined,  we  are  reluctant  to  draw  any  firm  conclusions  about  the  shape 
of  V(x)  for  these  beetle  experiments.  Note  that  the  spline  methods  as  we 
have  used  them  here  do  not  magically  reveal  the  shape  of  functions  such  as 
V(x)  —  they  only  estimate  the  constant  parameters  in  an  assumed  functional 
form.  We  are  also  cautious  because  we  feel, in  retrospect, that  the  beetle  mark- 
recapture  experiments  are  not  well  suited  for  identifying  convection  functions. 
This  unsuitability  results  from  the  release  of  all  beetles  in  one  position, 
and  the  fact  that  subsequent  recaptures  tended  to  overrepresent  the 
middle  regions  of  arrays  and  underrepresent  the  peripheral  reaches  of 
each  array.  Note  that  the  shortage  of  recaptures  near  the  periphery 
could  be  explained  by  either  of  two  opposite  convection  processes: 

1)  only  a  few  beetles  are  caught  away  from  the  center  because  convection 
towards  the  center  prevents  their  outward  Bpread,  or  (li)  only  a  few 
beetles  are  caught  away  from  the  center  because  whenever  they  enter  that 
region,  they  are  rapidly  transported  out  of  the  arrays  due  to  an  outward 
convection.  To  best  identify  convection  processes,  mark-recapture 
experiments  should  begin  with  a  uniform  distribution  of  marked  individuals. 
Changes  in  that  uniform  distribution  could  then  be  used  to  distinguish 
among  different  models.  The  flea  beetle  experiments  suffer  because  too 
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few  individuals  were  observed  in  regions  chat  we  speculated  would  be 
characterized  by  high  convection, 

A  second  major  limitation  of  our  analysis  concerns  its  assumption  of 
constant  parameters  through  time  in  spite  of  the  biological  inevitability  of 
temporal  variation  in  inSect  movement  behavior.  Indeed,  Table  3  includes 
examples  of  what  appear  to  be  temporally  varying  parameters,  that  is 
parameter  estimates  which  vary  widely  using  recapture  data  from  different 
days,  but  Identical  experiments  and  beetle  species.  Only  rarely  were  we 
able  to  find  one  set  of  parameters  that  predicted  several  consecutive 
recapture  distributions.  Because  insects  are  ectotherms  and  are  very 
sensitive  to  weather,  their  movement  behavior  will  vary  from  day-to-day 
as  a  consequence  of  variation  in  weather.  He  are  in  the  process  of 
extending  our  analyses  to  Include  temporal  variation  in  D,  V  and  p 
in  equation  (17).  This  elaboration  is  necessary  if  we  are  to  model 
insect  dispersal  in  extended  field  situations. 

A  final  caution  involves  the  potential  for  obtaining  good  matches 
between  model  and  data  (i.e.,  low  RSS'a),  yet  biological  nonsense. 
Applications  of  these  identification  approaches  should  always  entail 
efforts  to  get  Independent  estimates  of  parameters  as  much  as  possible. 
Otherwise,  what  appears  to  be  numerical  success  might  correspond  to 
biological  absurdity.  For  example,  one  of  the  sets  of  parameters 
that  we  sought  to  Identify  was  D  and  p  in  equation  (17),  holding  V(x)  =  0. 
Doing  this  we  occasionally  obtained  low  RSS's,  even  for  more  than  one 
consecutive  time  period.  But,  the  decay  rates  or  p's  that  were  thua 
identified  were  impossibly  high  —  they  corresponded  to  decay  rates 
five  times  higher  than  anything  ever  observed  for  flea  beetles  in  the 
experimental  arrays. 


6.  Concluding  remarks 


The  spline  techniques  we  describe  In  this  paper  are  very  effective 
at  fitting  particular  transport  equations  to  data.  By  Itself,  this 
parameter  Identification  approach  cannot,  however,  lead  to  correct 
choices  about  what  type  of  transport  equations  are  appropriate  for 
particular  biological  systems.  Data  must  be  collected  and  experiments 
designed  In  special  ways  if  one  wants  to  use  the  parameter  Identification 
approach  to  distinguish  between  different  transport  models.  We 
recommend  experimenting  with  the  spline  methods  before  collecting 
experimental  data.  In  that  way  an  experimental  design  might  be  tailored 
so  that  it  can  extract  the  maximum  Information  from  spline  Identification 
methods.  Factors  such  as  initial  data,  time  schedule  for  collecting 
data  and  the  spatial  sampling  regime  will  all  influence  the  performance 
of  spline  methods.  We  have  found,  for  example,  that  one  of  the  worst 
types  of  initial  data  Is  a  point  release  of  marked  insects,  and  now  we 
plan  to  modify  our  mark-recapture  experiments  hereafter.  Spline 
identification  methods  will  undoubtedly  challenge  biologists  In  their 
Interpretation  of  data.  Most  Importantly  they  allow  us  to  address 
complex  transport  processes  well  beyond  simple  random  diffusion  and 
conventional  numerical  methoda. 


TABLE  2 


Applying  the  spline  parameter  identification  approach  to  Phyllotreta  striolata 

dispersal  in  different  experimental  arrays.  N  *=  number  of  spline  basis 

elements;  %  expl.  ■  %  TSSQ  explained  by  model,  that  is,  it  equals 
TSSQ-RSS 

— x  100;  t  *  time  period  or  periods  at  which  data  were  collected.  In 
all  cases  the  initial  data  are  u(0,x)  *  211.2  for  x  ■  .5  and  0.0  for  x  f  0.5. 
The  initial  spline  for  t  «  0  was  fixed  to  be  the  same  for  all  analyses  below 
-  it  was  the  spline  that  best  approximated  above  initial  data.  Note  that  the 
results  of  any  given  identification  run  may  depend  on  initial  guesses. 


Example  5.1 


Identify  0,  V,  and  u  in 


3u 

3t 


»*-$♦  j:  (V(x-.5)u)  -  vu 

dX 


9m  interpatch  spacing 
N  -  32 

t  «  1  and  3  days 

We  searched  first  for  D,  then  D  plus  it,  then  D,v  and  V 
D  •  20.0  n2/day  %  expl.  ■  74.6% 

it  -  1.9 

V  »  459.2  m/dtty  Fj  1Q  -  9.77 
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Table  2  -  continued 


3m  interpatch  spacing 

N  =  32 

t  =  1  day  only 

D  =  4.0  m2/day  %  expl.  =  99.9% 

V  =  1.35 

V  =  -178.4  m/da y  F  .  =  851.0 

p  <  .001 


Example  5.2 

Identify  D 

and  V  in 

3u 

3t 

"  +  "£% 

(V(15(x-.5))3u) 

9m  interpatch  spacing 

N  = 

22 

%  expl.  *  21% 

t  « 

2  hrs,  3  hrs 

F2,ll  *  J-45 

D  - 

1600  m2/day 

p  <  .5 

V  - 

-2.4  m/day 

Example  5.3 

Contrast  diffusion  alone,  to  diffusion  plus  convection  in 


It  "  D  *  TZ  (V(6.25Cx-.5)Su) 


Table  2 
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continued 


6m  interpatch  spacing 
N  *  22 
t  *  1  day 


Diffusion  alone 
D  ■  2520  m2/day 
%  expl.  *  24% 

F.  _  »  2.19,  p  <  .25 


Diffusion  and  convection 
D  »  240  a2/day,  V  «  -114  m/day 
%  expl.  =  95.6% 

F  *  65.6,  p  <  .001 


Improvement  in  %  expl.  by  adding  convection 


Fj  6  ■  98.4,  p  <  .001 


3a  interpatch  spacing 
N  «  22 
t  «  1  day 

Diffusion  alone  Diffusion  and  convection 

D  ■  2190  a2/da y  D  ■  320  a2/day,  V  «  -43.7  a/day 

%  expl.  ■  31%  %  expl.  »  97.3% 

F.  ,  -  3.12,  p  <  .25  F_  .  *  72.0  p  <  .001 

4,0 


Iaproveaent  in  %  expl.  by  adding  convection 


98.3,  p  <  .001 
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TABLE  3 


Parameter  identification  using  data  from  the  same  experiment  and  beetle  species, 

but  different  time  period  after  release.  As  in  Table  2,  the  initial  spline 

was  fixed  to  provide  the  best  fit  to  u(0,x)  *  211.2  for  x  •  O.S  and  0.0 

for  x  /  0.5.  All  analyses  below  were  run  with  the  number  of  spline  basis 

elements  equal  to  22.  The  "%  expl."  below  refers  to  the  %  of  TSSQ  explained 

TSSO-RSS 

by  diffusion  model,  that  is,  it  equals  — —  x  100. 


Example  5.4 


Identify  D  in 


I.  Phyllotreta  striolata  in  linear  array  with  3m  interpatch  spacing 

(i)  data  from  t  *  1  day 

2 

estimated  D  *  2190  m  /day  %  expl.  ■  31% 

Fj  j  *  3.12,  p  <  .25 
(ii)  data  from  t  *  3  days 

2 

estimated  D  ■  8800  m  /day 

Fj  7  ■  3.63,  p  <  .25  %  expl.  *  34% 

II .  Phyllotreta  striolata  in  linear  array  with  6m  interpatch  spacing 

(i)  data  from  t  *  1  day 

estimated  D  ■  2520  m^/da y 

Fj  7  ■  2.19,  p  <  .25  %  expl.  -  24% 

(ii)  data  from  t  «  3  days 

2 

estimated  D  *  8900  m  /day 

Fj  7  »  2.94,  p  <  .25  %  expl.  »  30% 
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Table  3  -  continued 

III.  Phyllotreta  striolata  in  linear  array  with  9b  interpatch  spacing 

(i)  data  from  t  ■  1  day 

estiaated  D  ■  2330  a^/day 

P.  -  *  .66,  p  <  .5  %  expl.  *  12% 

1 

(ii)  data  froa  t  ■  3  days 

2 

estiaated  D  *  9600  m  /day 

F.  .  -  2.74,  p  <  .25  %  expl.  «  35% 

Note  that  in  none  of  the  examples,  does  diffusion  alone  provide  a  statistically 
significant  description  of  recapture  distribution.  Nonetheless,  the  variation 
in  D's  as  a  function  of  time  is  striking. 


For  the  arguments  to  establish  Lemma  3.1,  we  assume  the  reader  is 

familiar  with  the  theory  of  linear  semigroups  and  dissipative  operators  on 

2 

a  Hilbert  space  (see  [1],[15]).  We  first  define  A^Cq)  =  q^D  and 
Aj(q)  *  q2D  *  q3  so  that  A  *  +  A^.  Here  Dom  (A^)  =  Dom(A)  and  Dom(Aj)  = 

H*.  Next  we  observe  that  AQ(q)  is  maximal  dissipative  in  Z  and  is  thus 
the  infinitesimal  generator  of  a  (^-semigroup  of  contractions  [15,  p.17, 

Thm.  4.5b’ .  The  dissipative  estimate  follows  immediately: 

<AQ(q)z,z>  *  <qJD  z,z>  =  - |qj | I Dz j 2  <  0. 

Indeed,  A^  is  self-adjoint  with  spectrum  in  (-*,0]  so  _£p(Ag-AI)  =*  Z  for 
A  >  0  (see  [19, p.349])  and  hence  Afl  is  maximal  dissipative. 

We  procode  by  considering  Aj(q)  and  demonstrating  that  it  is  a  relative¬ 
ly  bounded  dissipative  perturbation  of  A^(q) .  First  observe  that  for 
4  €  Hg,  an  integration  by  parts  yields 


|  (q2W)4  -  -j  ♦D(q24)  -  -J  4[Dq2+q2D4] 


or 


Thus 


2j(q2D*)*  -  -|oq242. 

<Aj (<!)♦,♦>  »  f(q2Dl^3®^  *  1 1"  ^>q2*q3V 
<  +  lq3^^»2 


and  hence  Aj(q)  -  ul  is  dissipativo  where  u  is  as  defined  in  the  statement 
of  Lemma  3.1. 

2 

Turning  next  to  arguing  relative  boundedness,  we  suppose  ♦  €  Dom(AQ)  =  II  n 
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I Al^l 2  =  |q2D<^>+cl3^! 2  -  iq2D4>^ 2  +  lqjLI*l; 


while  (again  we  integrate  by  parts) 


f 1 1  D4> ! 2  =  -?-[  V  (D*)fc4!)  =  -f-  f^q.fD2*)* 

Jq  o  1  ql  J0 

If  we  consider  this  last  term  as  JT"J(alA0$l)  (j~U|)  and  use  the  fact  that 

12  2  * 
ab  ±  •jCa  +b  ) ,  we  thus  obtain 

*  7?"lAn+l2  ♦  ~ ~  -^UI 


VA»”2  ‘  ^ 


Hence  we  have 


NI2£^IVl2*^l*l2. 


Choosing  a  =  i(«^qp/|q2|w,  we  find 


|q2D^J2  <  |q2ljD^|2<I|A0^2  +  -|rl^2- 


Combining  this  last  estimate  with  (Al) ,  we  finally  obtain 


so  that 


lvl2i  ylvu  *  *  KUW2- 


KAjC,)-.!}*^  £.i|A0*l2  ♦'{wkjli/qj  *  k3l.)|*l2. 
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It  follows  that  Aj(q)  -  wl  is  dissipative  and  relatively  bounded  with 
respect  to  and  satisfies  the  hypothesis  of  [15,p.84,  Thm.  3.1].  Thus 
A(q)  -  wl  =  AQ(q)  +  A^q)  -  wl  is  the  generator  of  a  C0~semigroup  of  con¬ 
tractions  and  hence  A(q)  =  (AQ(q)  +  A^q)  -  wl)  +  wl  is  the  generator  of 

a  CQ- semigroup  (S(t)}  satisfying  ||S(t)j|  <  eBt  (see  [15,  p.80,  Thm. 1.1]). 
This  yields  (see  [15,  p.16,17,  Thm.  4.5(a)])  the  first  two  claims  of  Lemma  3.1. 

That  mild  solutions  of  (5)  are  given  by  (6)  actually  follows  by  definition 

from  the  usual  theory  of  linear  semigroups  and  abstract  evolution  equations 
(e.g.,  see  [1] , [15] ) - 


') 

j 
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Appendix  2 

We  establish  the  veracity  of  Lemma  4.1.  First,  simple  arguments  reveal 

that  whenever  q2>q3  ^  W»*  we  have  ^  H  Dom(A2(q*))  c  H4  ft  H*.  For  if 

4>  €  Dom(A2(q*)),  then  A(q*3 £  Dom(A(q*))  =  H2  0  H^;  that  is,  q^D2^  + 

*  *  2  1 

q2M  +  q^  =  Y  is  in  H  n  H^.  It  follows  immediately  that 

2  *  *  *  2  4  1 

D  <J>  =  l/q1[y-q2Di)i-q3!J>]  is  in  H  or  that  4>  6  H  (we  have  $  £  since 

Dom(A2(q*))  c  Dom(A(q*))  =  H2  ft  hJ) . 

To  complete  the  arguments  for  Lemma  4.1,  we  need  some  estimates  on  the 
N  N 

projection  operators  P  :  Z  -*■  Z  as  defined  above.  These  estimates,  which 

are  themselves  derivable  from  well-known  results  from  the  theory  of  spline 

approximations,  are  given  in  [5,  Lemma  2.3]  and  are  as  follows: 

4  1 

There  exist  constants  c^  such  that  for  z  €  H  ft  HQ, 

|pNz-z(2  <  c0N‘4|d4z|2 
I D(PNz-z) | 2  <  CiN'3|d4z|2 
|D2(PNz-z)|2  <  c2n’2|d4z|2. 

Finally,  for  z  €  9  c  H4  H  hJ  we  consider  the  estimates  (to  facilitate 
notation  we  use  | • |  for  | ' | 2  here) 

|AN(qN)z-A(q*)z|  =  |pNA(qN)PNz-A(q*)z j 

<  |PN(A(qN)PNz-A(q*)z  |  «•  | (PN-I)A(q*)z| 

<  |A(qN)PNz-A(q*)z|  ♦  |  (PN-I)A(q*) z| . 

The  second  term  in  this  last  expression  ■+  0  as  N  -*■  «.  This  follows  since 
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N  N 

P  -*■  I  on  Z  (the  above  estimate  yields  that  P  z  -*  z  for  a  dense  set  of 

N 

z  in  Z  and  the  (P  }  are  uniformly  bounded  on  Z). 

Consider  the  first  term: 

|A(qVNz-A(q*)z!  =  |  (q^D^q^D+q^ PNz  -  (q*D2+q*D+q*) z | 

<  |qjD2(PNz-z)[  4  |(qJ-q*)D2z[ 

4  |q^D(PNz-z)|  4  !(q^-q*)Dz| 

+  jqj(PNz-z)!  4  |(q^-q3)z| 

<  |qil|D2(PNz-z)|2  4  |q*-q*!lD2*|2 

+  !q2l»!D(pNz-z)  l2  +  lq2"q2MDzl» 

-  k^Lk^l,  -  kJ-qJl2l*L* 

But  every  term  in  this  last  sum  o  as  N  *  due  to  the  following  facts: 

kJl.kjU^L  are  bounded,  q^  q*,  q^  ♦  q^  in  L2>  i  =  2,3,  and 

N  N  2  N  2 

P  z,Drz,D  P  z  converge  to  z,  Dz,  D  z,  respectively. 
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